Evaluation of per-record identification risk and swappability of records in a microdata set via decomposable models
Abstract
We propose a strategy for disclosure risk evaluation and disclosure control of a microdata set, based on fitting decomposable models to the multiway contingency table corresponding to the microdata set. By fitting decomposable models, we can evaluate the per-record identification (or re-identification) risk of a microdata set. Furthermore, we can easily determine whether risky records can be swapped without disturbing the set of marginals of the decomposable model. The use of decomposable models has already been considered in the existing literature. The contribution of this paper is a systematic strategy for finding a model with a good fit, identifying risky records under that model, and then applying the swapping procedure to these records.
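As a minimal illustration of the two ideas in the abstract (not the paper's actual procedure), the sketch below assumes three categorical variables A, B, C and the decomposable model [AB][BC], whose fitted cell count is n(a,b) n(b,c) / n(b). The function names `fitted_counts`, `marginals`, and `swap_c`, and the toy records, are hypothetical. Treating a low fitted count as a sign of risk is an assumption made only for this example.

```python
from collections import Counter

def fitted_counts(records):
    """Fitted counts under the decomposable model [AB][BC] for records (a, b, c).

    Cliques are {A,B} and {B,C}; the separator is {B}, so the fitted count
    for a cell is n(a,b) * n(b,c) / n(b).
    """
    n_ab = Counter((a, b) for a, b, c in records)
    n_bc = Counter((b, c) for a, b, c in records)
    n_b = Counter(b for a, b, c in records)
    return {r: n_ab[r[0], r[1]] * n_bc[r[1], r[2]] / n_b[r[1]]
            for r in set(records)}

def marginals(records):
    """Return the AB and BC marginal tables that the model keeps fixed."""
    return (Counter((a, b) for a, b, c in records),
            Counter((b, c) for a, b, c in records))

def swap_c(records, i, j):
    """Swap the C values of records i and j.

    Assumption for this sketch: the swap is allowed only when both records
    share the same separator value B, which leaves AB and BC unchanged.
    """
    (a1, b1, c1), (a2, b2, c2) = records[i], records[j]
    if b1 != b2:
        raise ValueError("records must agree on the separator variable B")
    out = list(records)
    out[i], out[j] = (a1, b1, c2), (a2, b2, c1)
    return out

# Toy data (hypothetical): the first two records are sample uniques.
recs = [("x", "p", "u"), ("y", "p", "v"), ("x", "q", "u"), ("x", "q", "u")]
fits = fitted_counts(recs)
risky = [r for r in recs if fits[r] < 1]   # illustrative threshold only
swapped = swap_c(recs, 0, 1)
print(risky, marginals(swapped) == marginals(recs))
```

In this toy example the two records with B = "p" have low fitted counts, and swapping their C values changes both records while leaving the AB and BC marginal tables identical.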
Related articles
Mathematical Engineering Technical Reports: Evaluation of per-record identification risk and swappability of records in a microdata set via decomposable models
We propose a strategy for disclosure risk evaluation and disclosure control of a microdata set based on fitting decomposable models of a multiway contingency table corresponding to the microdata set. By fitting decomposable models, we can evaluate per-record identification (or re-identification) risk of a microdata set. Furthermore we can easily determine swappability of risky records which doe...
Conditions for swappability of records in a microdata set when some marginals are fixed
We consider swapping of two records in a microdata set for the purpose of disclosure control. We give some necessary and sufficient conditions that some observations can be swapped between two records under the restriction that a given set of marginals are fixed. We also give an algorithm to find another record for swapping if one wants to swap out some observations from a particular record. Ou...
Evaluation of per-record identification risk by additive modeling of interaction for contingency table cell probabilities
We propose to fit a Lancaster-type additive model of interaction terms for cell probabilities of contingency tables to evaluate the conditional probability of population uniqueness of sample unique records in microdata sets. Moment estimation of the Lancaster-type additive model is straightforward and the proposed estimation procedure is intuitively appealing from the viewpoint of disclosure ri...
Preserving Edits When Perturbing Microdata for Statistical Disclosure Control (Natalie Shlomo, Ton De Waal)
To protect individuals in microdata from the risk of re-identification, a general perturbative method called PRAM (the Post-Randomization Method) is sometimes used for masking records. This method adds “noise” to categorical variables by changing values of categories for a small number of records according to a prescribed probability matrix and a stochastic process based on the outcome of a ran...
Disclosure risk assessment in statistical microdata protection via advanced record linkage
The performance of Statistical Disclosure Control (SDC) methods for microdata (also called masking methods) is measured in terms of the utility and the disclosure risk associated to the protected microdata set. Empirical disclosure risk assessment based on record linkage stands out as a realistic and practical disclosure risk assessment methodology which is applicable to every conceivable maski...